ABSTRACT

The efforts of a pair of curriculum developers to 
conduct "formative" and "summative" evaluation of an experimental 
civics course are described here, along with some of the consequences 
of their efforts, and a few of the pitfalls they encountered. 
Formative evaluation refers to those practices that produce data 
enabling developers to improve their products during the development 
stage. Summative evaluation refers to an over-all final evaluation of 
the product with the purpose of producing information useful to the 
ultimate consumers. Formative evaluation procedures described include 
pre- and posttesting of student political attitudes, objective 
testing of performance, open-ended teacher questionnaires, criticism 
of the course by outside readers, teacher de-briefing sessions, 
teaching one class by course developers, and site visits to pilot 
classes. Three instruments were constructed to provide a summative 
evaluation of the course: a political knowledge test, a political 
science skills test, and an attitude test. The report describes the 
plans for administering these tests as well as plans for evaluating
the trained versus the untrained teachers and checking teacher and 
student response to the proposed instructional materials. (JY) 
Product evaluation presents a number of serious problems to curriculum developers,
some that are not resolved by typical evaluation techniques. Scriven's¹ argument
that developers consider "formative" and "summative" evaluation stages helps to
clarify these problems and offers suggestions to deal with them. This paper
describes the efforts by one pair of developers to conduct "formative" and
"summative" evaluation of an experimental civics course, some of the consequences
of their efforts, and a few of the pitfalls they encountered. The paper makes no
attempt to contribute directly to a "theory" of curriculum evaluation. Quite the
contrary. By describing a real experience, it will become readily apparent how
wide the gap between theory and practice really is.

For the purpose of this paper "formative evaluation" refers to those practices
that produce data enabling developers to improve their products during the
development stage. "Summative evaluation" refers to an over-all final evaluation of
the product with the purpose of producing information deemed useful to ultimate
consumers. While these two stages intersect and even overlap at points, it seems
useful for analytical purposes to think of course evaluation as passing through
these two stages.

The "product" referred to in this paper is a two-semester high school social
science course entitled "American Political Behavior" under development at Indiana
University's High School Curriculum Center in Government. The Government Center,
funded by the Cooperative Research Branch of the U.S. Office of Education, was
established in 1966 and is sponsored by the Department of Political Science and the
School of Education at Indiana University. "American Political Behavior," the
first course to be developed by the Center, underwent initial pilot trials in 40
schools during 1968-69 and is currently being used in a revised form in 49
different schools. It is important for the purpose of this paper to make clear that
the level of project funding has been adequate to support a small professional
and clerical staff but not sufficient to employ professional evaluators.
Therefore, the evaluation to be described was planned and carried out by the project
directors, the authors of this paper, and fully non-accredited amateur evaluators.²

The basic questions that guided the evaluation activities described in this 
paper are: 

1. Can the course "American Political Behavior" be used successfully in the
environments provided by typical schools? 

2. Can the course be taught as effectively by untrained as by trained teachers?

3. Are there any particular types of students for whom the course seems in- 
appropriate? 

4. Can students master the course content? 

5. Does the course represent valid political science knowledge and method? 

6. Does the course affect students’ political attitudes, values, and beliefs 
in socially desirable ways? 

7. Do teachers and students like the course? 

8. Which kinds of lessons are most likely to succeed and which are most likely
to fail?


Formative Evaluation:

As noted above, formative evaluation refers to those practices that produce
data enabling developers to improve their products during the development stage.
The following practices were undertaken in an effort to modify and to improve the
course "American Political Behavior": pre- and post-testing of student political
attitudes; objective testing of student mastery of performance objectives;
open-ended teacher questionnaires; criticism of the course by a panel of outside
readers; a meeting at the end of the first year with pilot teachers; teaching of one
class by course developers; site visits to pilot classes with interviews of pilot
teachers, students, and school administrators. In the paragraphs that follow, each
technique will be described; reference will be made to questions the technique
sought to answer; changes stimulated by the technique will be cited; and
difficulties connected with each technique will be indicated.

Tests of mastery learning. The "American Political Behavior" course is
constructed to facilitate mastery learning, the attainment of performance objectives
by the majority of students in a particular group. A performance objective is a
statement that indicates exactly what a student is able to do as a result of
instruction.⁴

Performance objectives are provided with each daily lesson plan in the teach- 
er's guide. Teachers know precisely the purposes of the lesson and can teach to 
accomplish them. An important element of the instructional strategy is to provide
numerous application lessons that enable students to apply knowledge and skills 
acquired in preceding lessons. 

At the end of each instructional sequence, on the average every two weeks, 
the teachers administered a multiple-choice type examination designed to measure 
the performance objectives of the material most recently taught. Each item was
designed to be a valid measure of one of the objectives. Therefore, theoretically, 
success on the item represented successful mastery of the objective and the mate- 
rial related to it. 

The tests of mastery learning were designed to reveal strengths and weaknesses 
in the instructional materials. For example, if most students responded correctly 
to a set of test items pertaining to a performance objective, we assumed that the 
instructional materials constructed in terms of this performance objective were
communicating successfully to students. If most students responded incorrectly to
a set of test items pertaining to a performance objective, we assumed that either
the pertinent instructional materials or the test items were flawed and in need of
revision. In most instances, a pattern of incorrect student response across
different student groups indicated inadequacy of the instructional material and
prompted the redesign of particular parts of the course.
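
The decision rule described in this paragraph can be sketched in a few lines of
Python. This is an illustration only, under assumed details that are not in the
project's records: the data layout (proportion of students correct, per objective
and per student group), the 0.5 cutoff standing in for "most students," the
reading of "a pattern across different student groups" as "every group," and the
names `flag_objectives` and `sample` are all hypothetical.

```python
def flag_objectives(results, cutoff=0.5):
    """Classify each performance objective from per-group proportions correct.

    results maps objective -> {group name: proportion of students correct}.
    The cutoff and the three status labels are assumptions for illustration.
    """
    report = {}
    for objective, by_group in results.items():
        weak_groups = [g for g, p in by_group.items() if p < cutoff]
        if not weak_groups:
            report[objective] = "communicating"
        elif len(weak_groups) == len(by_group):
            # Incorrect responses in every group point at the materials.
            report[objective] = "revise materials"
        else:
            # Mixed pattern: the items or local conditions may be at fault.
            report[objective] = "check items or local conditions"
    return report


# Invented figures, not data from the pilot schools.
sample = {
    "objective_1": {"school_a": 0.82, "school_b": 0.74},
    "objective_2": {"school_a": 0.31, "school_b": 0.40},
    "objective_3": {"school_a": 0.65, "school_b": 0.45},
}
report = flag_objectives(sample)
```

A summary of this kind needs only class-level counts per item, not individual
answer sheets.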

It was hoped that the gathering of objective test data from all of the students 
would be the most powerful and efficient technique for formative evaluation. While 
it was helpful on several occasions, it was not worth the time, money, and energy 
given to it. The system was theoretically simple and seemed efficient. However, 
teachers failed to return tests promptly; some tests were lost; teachers frequently 
did not check to make certain that answers were recorded in correct places; and 
students failed to code their tests properly. The result was a gigantic snarl. 
Special assistants were hired to check individual answer sheets, and computer pro- 
grammers were hired to try to eliminate some errors by program. The result was an 
enormous headache and great strain on a limited budget. Probably, we could have 
accomplished as much by simply asking teachers to record class scores on individual 
test items. This simple information might have provided better data than we ulti- 
mately used. 

Teacher questionnaire. At the end of each instructional sequence, approximately
ten days, pilot teachers were asked to complete a questionnaire we provided them.
Each two- to three-page questionnaire asked teachers specific questions about
individual lessons. It also provided an opportunity for each teacher to comment at
length about the course.

The questionnaires frequently were the source of useful tips. We found ideas 
for the way lessons might be restructured. When the questionnaires revealed that 
most teachers were having a similar difficulty with a particular segment of the 
course, we concluded that this portion of the instructional materials probably
needed revision.

Panel of outside readers. Two types of readers were used: political science
scholars who are specialists in political behavior and specialists in social studies
education. The former were used to provide validation of political science content
and method in the course; the latter checked us on pedagogical strategies,
sequencing of lessons, etc.

Outside readers were used at two different stages. Early drafts of units were 
sent to readers when the developers were treating concepts that presented special 
problems for them. When the pilot version of the course was completed, the entire 
course was read by one political scientist and one social studies specialist who 
wrote extensive critiques of the material. 

The assistance of outside readers was simple to arrange, relatively cheap, and 
produced excellent results. Ideas for presenting the material were acquired, and 
some material was entirely rewritten on the basis of the outside assessments. For 
example, a section on the influence of personality on political behavior was judged 
particularly weak and has been rewritten to bring it into line with current schol- 
arly views. 

End-of-year meeting. In June, 1969, we met approximately one-half of the pilot
teachers at a three-day meeting in Bloomington. The purpose of the meeting was to 
de-brief the teachers on the basis of their experience teaching the "American Po- 
litical Behavior" course during the 1968-69 academic year. All of these teachers 
had been trained in a seven-week institute during summer, 1968 prior to teaching 
the course. The purpose of the summer institute had been less to train them to 
teach the course than to train them to be critics of the course. In short, they 
had been trained to become partners in formative evaluation. 

At the June meeting, discussion ranged over all elements of the course. The 
sessions were tape-recorded in order that specific sessions might be replayed if 
necessary. The session proved to be very valuable, not because it turned up new
problems that had not been recognized earlier, but because it tended to confirm the
conclusions reached by other evaluation techniques. It was particularly useful to
have many teachers present to discuss the course, however, because the complaint
of a single teacher often turned out to be less serious than originally believed
when it was played out among all the teachers present.

Teachers were particularly warm in their praise of case studies, slide-tape 
lessons, and the few simulation-games we had provided. Enthusiasm by teachers for 
the lesson plans we had devised strengthened our resolve to keep them. 

Developers' class. Probably the most useful and simple formative evaluation
practice is for developers to teach students who are using the experimental course.
We gained the permission of local school authorities to establish one section of
ninth-graders in a local high school who were our responsibility throughout the
school year. By teaching the course, we became instantly aware of serious problems
we could repair immediately, without awaiting feedback from other teachers. We were 
able to make judgments about the readability of the material, pacing, sequencing, 
etc. When students seemed to lose interest in the course, we were the first to know 
and were under direct pressure to do something about it. 

The principal drawback we found in teaching our own class was the drain on 
energy and time. When we were meeting our students, we were unable to travel to 
observe pilot teachers. And we had less time to write. Therefore, this type of
evaluation is expensive but probably worth the cost. 

Site visits. We were able to visit 30 of the 40 pilot teachers during the
first year. When one adds the time required to travel, it is apparent that nearly
one-third of the 180-day school year was spent in the field visiting the pilot
schools. Site visits are demanding. We talked to the principals, the teachers,
and the pilot students at least. Frequently, we were asked to meet other
administrators and to speak to the social studies faculty.



Despite the high cost in travel money, time lost, and energy expended, site 
visits are absolutely essential to the developer. The best way to learn how a 
course is being taught in a typical classroom is to visit one. Rarely was our 
course taught exactly as we had conceived it; occasionally it turned out much
better than we had imagined it could be; often it was far worse. We found the
principal was usually an excellent informant regarding how the course was perceived
by the community at large. The students often provided data leading to conclusions
that deviated from those derived from test data. It was clear, for example, that
students frequently had learned more from the course than test scores had indicated.
We learned that in our effort to measure "higher levels" in the Bloom taxonomy, 
the items became so complex that they were missed because the student could not 
make sense of the test question. Oral questioning of the students tended to in- 
crease our confidence in the course and decrease our confidence in some of the ob- 
jective test items. 

However, site visits tended to support over-all impressions of test data.
Where the course was being used with students of low scholastic attainment and
limited reading ability, the course was failing. Not surprisingly, the course had
the greatest success among the highly gifted, academically inclined students. On
the other hand, the course was not a course only for academically able youngsters.
It was being mastered by typical ninth-grade youngsters who were reading at an
eighth- or ninth-grade reading level.

Test of political attitudes. American schools offer courses in civics and
government not only because they wish to impart political information, but also
because they hope to influence students to hold "positive" political values. It is unlikely
that any civics course would be accepted by the schools that undermined the attain- 
ment by students of socially prescribed "fundamental, American political values." 
While "American Political Behavior," unlike typical civics and government courses, 
makes no attempt to preach these values, it certainly intends to support them. 
As we were anxious primarily to learn of any "negative" impact the course
might have on student political attitudes during the formative evaluation stage,
we administered a political attitude instrument as a pre- and post-test to all
students taking the pilot course. This political attitude instrument consisted of
six sets of Likert-scaled items designed to measure political tolerance, sense of
political efficacy, political interest, political trust, support of majority rule
practices, and support of political pluralism. This political attitude instrument
was used to provide a rough indication of whether or not the course might have a
"negative" impact on political attitudes of students. As a whole, the student
performance on the political attitude instrument indicated a very slight movement
in a "positive" direction on each set of items except the political interest set.
Here students showed a very slight decline in political interest. However, as a
result of this part of the formative evaluation, we felt no need to massively
revise the course for the purpose of reinforcing or creating support for basic
democratic political ideals.

Summative Evaluation:

The purpose of summative evaluation is to provide educational decision-makers
with evidence about the worth of an educational product, in this instance the
"American Political Behavior" course. Before deciding to adopt a course of study,
school teachers and administrators should know how the new course performs in
terms of particular criteria and how the new course compares with similar products.
In order to provide evidence about the worth of a course of study, an evaluator at
least must: 1) construct instruments to measure changes in students' behavior
toward particular instructional objectives; and 2) administer these evaluational
instruments to randomly assigned student groups who have and who have not
experienced the experimental instructional materials.
Three instruments have been constructed to measure the impact of the "American
Political Behavior" course on students. A political knowledge test and a political
science skills test have been developed to measure student performance in terms of
knowledge and cognitive objectives of the course. An attitude test has been
developed to measure the effect of the course on student political attitudes relating
to democratic ideals.

An evaluational instrument that measures knowledge and skill outcomes of
instruction must satisfy three basic requirements in order to be valid. First, in
order to be a valid test of the relationship of student learning and a course of
instruction, test items must fit course objectives. This match between test items
and objectives of instruction is the major contributor to the validity of an
instrument designed to measure instructional materials. Second, experts must agree on
the "right" or "best" answer to each item, if the test is to be considered valid.
And third, most students who have not experienced the experimental instructional
materials must not be able to respond correctly to the test items.

We need to measure changes in student behavior in order to measure what stu- 
dents have learned as a result of experiencing a particular type of instruction. 
Tests designed to discriminate individual differences in performance among students
do not produce evidence from which one can infer rigorously the relationship of a 
particular type of instruction to learning. Thus, the standard type of item anal- 
ysis used in test development does not apply to the development of tests to measure 
mastery learning. For example, the standard type of item analysis, for the purpose 
of building tests which measure individual differences, requires the elimination 
of test items which most students answer correctly or incorrectly. Such items do 
not discriminate among individual learners. In contrast, the development of tests 
to measure mastery learning requires the elimination of items which most students 
answer correctly prior to a particular type of instruction and the retention of 
items which most students answer incorrectly prior to instruction. The aim is to
build a test which can measure changes in student performance related to particular
instruction.⁶

In order to build valid political knowledge and political science skills tests 
for use in summative evaluation, we first constructed items that we believe fit 
our instructional objectives. Next, we sought the aid of political scientists to 
judge the items, to certify content validity. Then, we administered the tests to 
students who had not experienced our course in order to determine which items to 
retain for use in summative evaluation. Items which more than one-half of the 
"pilot test" students answered correctly were dropped from the instrument, as it 
was presumed that these items could not help us to measure changes in student per- 
formance that were related to experiencing the "American Political Behavior" course. 
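
The screening rule in this paragraph (drop any item that more than one-half of
the uninstructed "pilot test" students answer correctly) can be sketched as
follows. The function name `screen_items`, the 1/0 response coding, and the
sample responses are hypothetical; only the one-half threshold comes from the
text.

```python
def screen_items(pilot_responses, max_correct=0.5):
    """Split items into (retained, dropped) by share correct before instruction.

    pilot_responses maps item id -> list of 1 (correct) / 0 (incorrect)
    answers from students who have not taken the course.
    """
    retained, dropped = [], []
    for item, answers in pilot_responses.items():
        share_correct = sum(answers) / len(answers)
        # "More than one-half" is a strict inequality: exactly 0.5 is kept.
        if share_correct > max_correct:
            dropped.append(item)
        else:
            retained.append(item)
    return retained, dropped


# Invented responses from four uninstructed students per item.
pilot = {
    "item_a": [1, 1, 1, 0],  # 75% correct without instruction
    "item_b": [0, 1, 0, 0],  # 25% correct
    "item_c": [1, 0, 1, 0],  # exactly one-half correct
}
retained, dropped = screen_items(pilot)
```

Note the contrast with conventional item analysis: an item that nearly everyone
misses before instruction is kept here, because it leaves room to register a
change after instruction.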

In order to validly use our tests of knowledge and skills comparatively, to 
measure relative performance of groups who have and who have not experienced the 
"American Political Behavior" course, we wrote items that do not contain jargon 
peculiar to our course. Students who have not experienced the experimental course 
should not find it more difficult than students who have experienced the course to 
read our test items prepared for the summative evaluation. As the tests are free 
of special terminology, they are more likely to yield differences in understanding 
and knowledge between different groups of students. 

The attitude test consists of eight sets of Likert-type items. These eight
sets of items, or scales, are designed to measure the following attitudes:
political tolerance, sense of political efficacy, political interest, support for
majority rule practices, support for political pluralism, political trust, support for
practices that equalize opportunities among different socio-economic groups, and
support for major institutions of the national government. Collectively these
eight sets of items have been devised to yield a rough measure of student support
for "democratic" political practices and basic political institutions of our nation.

Construct validity for the attitude scale has been established through
analysis of inter-item correlations. Through this device, the internal consistency of
each set of items was established. Items that did not appear to fit, in terms of
student responses, with others in a set were dropped from the attitude test.
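
One way such an inter-item check might be carried out is sketched below. The
paper does not state the exact criterion used, so the rule here (flag an item
whose mean Pearson correlation with the other items in its set falls below a
cutoff) is an assumption, as are the 0.2 cutoff, the names `pearson` and
`weak_items`, and the sample scores.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def weak_items(scale, min_mean_r=0.2):
    """Return items whose mean correlation with the rest of the set is low.

    scale maps item id -> list of Likert scores, one per student, with
    students in the same order for every item.
    """
    weak = []
    for item, scores in scale.items():
        others = [pearson(scores, s) for k, s in scale.items() if k != item]
        if sum(others) / len(others) < min_mean_r:
            weak.append(item)
    return weak


# Invented scores for six students on a four-item set.
trust_scale = {
    "trust_1": [1, 2, 3, 4, 5, 4],
    "trust_2": [2, 2, 3, 5, 5, 4],
    "trust_3": [1, 3, 3, 4, 4, 5],
    "trust_4": [5, 1, 4, 2, 3, 1],
}
flagged = weak_items(trust_scale)
```

In this invented set the fourth item runs against the other three and is the
only one flagged for removal.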

This spring (1970) we plan to administer the tests of political knowledge,
political science skills, and political attitudes to "experimental" and "control"
groups in fifteen school systems in five geographical regions. In each case we
can claim random assignment of students to an "experimental" group who are taking
the "American Political Behavior" course and to a "control" group who are taking
another social studies course. The modal grade level of students involved in the
summative evaluation is ninth grade, but eighth, tenth, and twelfth grade groups
are also present. The student groups represent different socio-economic and ethnic
groups. For example, twelfth-graders in the predominately black, inner-city
community of Atlanta, Georgia are included in this field trial, as are twelfth-grade,
middle-class, white students from Eugene, Oregon. Small-town, white ninth-graders
from Mount Vernon, Indiana and white ninth-graders from the Kansas City
metropolitan area are participating in this evaluation of the "product." These examples
provide a picture of the range of types of student groups involved in this summative
evaluation.

The random assignment of students enables us to claim that the characteristics
of the "experimental" and "control" groups are comparable. Thus, we can employ a
"post-test only" research design.

In four evaluation sites, the classroom group is the unit of analysis of test
results. In each of these situations we have four or five "experimental" groups
to be compared with four or five "control" groups. In situations where multiple
"experimental" groups can be established, several evaluation experts argue that for
curriculum evaluation the classroom group, rather than the individual students who
make up the group, is the most useful unit of analysis.⁹ In eleven evaluation
sites, we are forced to use the student as the unit of analysis, as we were unable
to establish more than one "experimental" and one "control" group.

Through analysis of variance of scores on the political knowledge and political
science skills tests we hope to be able to claim that students who have experienced
our course have achieved its basic learning outcomes as specified in performance
objectives, and that students in "control" groups have not achieved these outcomes.
As other social studies courses, including other civics courses, do not share the
knowledge and skill objectives of the "American Political Behavior" course, we are
not directly comparing our product with a competing product. In fact, there is no
directly competing product, as other civics and government courses represent a
legalistic-normative approach to the study of government rather than a social
science approach to the study of political activity.
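
For a site where the classroom group is the unit of analysis, the comparison
might look like the one-way analysis of variance sketched below. The paper does
not specify the exact design, so this is an assumed, simplified version: each
class contributes one mean score, the function name `one_way_anova_f` is
hypothetical, and the class means are invented rather than taken from the field
trial.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups is a list of lists of scores, one inner list per condition.
    """
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((s - m) ** 2 for g, m in zip(groups, means) for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented class mean scores on a knowledge test: four classes per condition.
experimental_class_means = [31.0, 29.0, 33.0, 30.0]
control_class_means = [22.0, 24.0, 21.0, 23.0]
f_ratio, df_between, df_within = one_way_anova_f(
    [experimental_class_means, control_class_means]
)
```

With only four or five classes per condition the degrees of freedom are small,
which is the price of treating the classroom rather than the student as the unit.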

We hope to be able to present evidence to educational decision-makers that
our course does communicate effectively to students and that relatively permanent 
changes in student capabilities have occurred as a result of experiencing the 
"American Political Behavior" course. Educational decision-makers who value these 
kinds of changes — who value the objectives of the "American Political Behavior" 
course — are then in a position to decide to utilize the new program. However, 
educational decision-makers who do not value the kinds of learning outcomes that 
the new course may effect should not employ the course, even if our summative
evaluation indicates that it is an effective product.

Through chi-square and correlational analysis of scores on each of the eight 
attitude scales, we hope to be able to claim, at least, that students who have
experienced the "American Political Behavior" course are no more likely to express
"negative," or "anti-democratic," political attitudes than are students who have
not experienced the experimental civics course. We would be delighted to be able
to claim that our course is related to increased student expression of "democratic"
and "positive" political attitudes. And we believe that the new course is likely
to reinforce "positive" political attitudes that students bring to the course.
However, the performance objectives of our course pertain primarily to cognitive
outcomes, to knowledge and skill learnings, not to political attitude outcomes. Thus,
we do not anticipate a massive reorientation of student political attitudes to re- 
sult from experiencing the "American Political Behavior" course. 
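
The chi-square comparison on a single attitude scale might be set up as in the
sketch below. The collapsing of Likert responses into "positive" and "negative"
categories, the function name `chi_square`, and the counts are assumptions made
for illustration; no results from the field trial are represented.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Invented counts. Rows: experimental, control.
# Columns: "positive" responses, "negative" responses.
counts = [[60, 40], [50, 50]]
stat = chi_square(counts)

# For a 2 x 2 table (1 degree of freedom) the .05 critical value is 3.841.
groups_differ = stat > 3.841
```

A statistic below the critical value, as in this invented case, would be
consistent with the minimum claim sought here: that the experimental group is no
more likely than the control group to express "negative" attitudes.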

An additional aspect of the summative evaluation research design calls for the
comparison of the instruction of "trained" and "untrained" teachers using the
"American Political Behavior" course. "Trained" teachers are those who have
experienced, prior to teaching the new course, a special seven-week summer institute
taught by the course developers. "Untrained" teachers are those who have not
experienced special, intensive instruction prior to teaching the new civics course.
The "untrained" teachers have received only written instructions about how to teach
the new civics course in a teacher's guide, and they have been given two "position
papers" that describe the instructional materials and provide a rationale for the
use of the new program.

Among the fifteen school systems involved in the summative evaluation, fifteen
"untrained" teachers and nine "trained" teachers are using the "American Political
Behavior" course. In three of these school systems, both "trained" and "untrained"
teachers are involved in the summative evaluation. We hope that the students of
"untrained" teachers perform as well on our instruments of evaluation, relative to 
their control groups, as the students of "trained" teachers. If this occurs, we can 
claim that special, intensive instruction is not necessary to prepare a teacher to 
use the new civics course. 

A final feature of our summative evaluation involves the evaluation of the
instructional materials by students and teachers who have used the revised version
of the materials in field trials. Both students and teachers in each of the pilot
schools who are using the "American Political Behavior" material in field trials
will be asked to respond, at the end of the school year, to questions designed to
reveal their beliefs about the interest level, relevance, and over-all utility of
the new civics course relative to other social studies courses that they have taken.
These students and teachers will be asked also to respond positively or negatively
to a check list of basic features of the new civics course. These responses can
provide a rough indication of student and teacher affect for the new instructional
program.

Several difficulties and/or limitations connected with the use of this summative
evaluation research design must be indicated. Some of these limitations are
minimized through the conduct of several simultaneous, experimental field trials under
different conditions. This serves to diminish several possible alternative
explanations for the impact of the new course on students that might loom large if the
evaluation were conducted only under one set of conditions or only at one site.
And it serves to extend the generalizability of our findings.

We face the possibility that several factors other than the instructional
materials could account for any successes that are uncovered during the summative
evaluations. Factors such as the pedagogical skill or enthusiasm of the teacher,
particular learning conditions, the unusual skill or enthusiasm of the students, or
community influences could be as important as, or more important than, the instructional
materials in accounting for successful student performances. However, if each of
several, simultaneously conducted field trials produces favorable results under 
various conditions, our confidence in making claims about the utility of the "Ameri- 
can Political Behavior" course will be increased greatly. 

Another difficulty connected with summative evaluation is the establishment of
randomly assigned control groups and experimental groups. We were able to
establish this condition in only fifteen of the 49 schools that are involved in the
field trial of the "American Political Behavior" course.

Still another difficulty involves the need to obtain accurate information
about the prior curricular experience of students involved in the evaluation and
about special or unusual conditions affecting the learning environment. For
example, students in one of our experimental sites have been involved in field trials
of the anthropology and geography project materials prior to experiencing our course.
This prior experience is likely to affect their performance in our course. This
kind of information about the context of each field trial is necessary in order to
interpret satisfactorily the findings of the summative evaluation.

Because of the large number of variables and because of the many difficulties
involved in conducting summative evaluation, we cannot be certain that successful
student performance results directly and entirely from experiencing the "American
Political Behavior" course. But through the use of the summative evaluation
procedures described in this paper, we can claim that particular students do, or do
not, attain specific learning outcomes that are integrally involved in the new course.
And we increase the probability that our claims about the efficacy and/or weaknesses
of the course are accurate. These results of summative evaluation provide
educational decision-makers with grounds for deciding whether or not to utilize the new
instructional materials.

This description of the "formative" and "summative" evaluation of the "product" 
of a social studies curriculum project reveals some of the pitfalls, limitations, 
and fruitful possibilities involved in this two-stage evaluation process. Hopefully, 
this recounting can serve others who are interested in the challenges of
instructional materials development.







FOOTNOTES

¹Scriven, Michael. "The Methodology of Evaluation." In Perspectives of
Curriculum Evaluation. AERA Monograph Series on Curriculum Evaluation. Chicago: Rand
McNally and Company, 1967, pp. 39-83.

²The following articles provide descriptions of the course in "American
Political Behavior" and the assumptions of the course developers. Mehlinger, Howard.
The Study of American Political Behavior. Bloomington: Indiana University, 1967,
unpublished paper; Patrick, John J. "Teaching High School Students about American
Political Behavior." The North Central Association Quarterly 43:234-242, Fall,
1968.

³Bloom, Benjamin S. "Learning for Mastery." UCLA-CSEIP. Evaluation Comment,
May, 1968.

⁴Gagne, Robert M. "Curriculum Research and the Promotion of Learning." In
Perspectives of Curriculum Evaluation. AERA Monograph Series on Curriculum
Evaluation. Chicago: Rand McNally and Company, 1967, p. 21.

⁵Scriven, op. cit., pp. 45-60; Wittrock, M.C. "The Evaluation of Instruction:
Cause and Effect Relations in Naturalistic Data." UCLA-CSEIP. Evaluation Comment,
May, 1969.

⁶Ibid., pp. 4-5.

⁷Oppenheim, A.N. Questionnaire Design and Attitude Measurement. New York:
Basic Books, Inc., 1966, pp. 133-143.

⁸Popham, W. James. Simplified Designs for School Research. Inglewood,
California: Southwest Regional Laboratory for Educational Research and Development,
October, 1967; Kerlinger, Fred N. Foundations of Behavioral Research. New York:
Holt, Rinehart and Winston, Inc., 1966, pp. 301-321.

⁹Popham, op. cit., p. 11; Baker, Robert L. "Curriculum Evaluation." Review
of Educational Research 39:341, June, 1969.



